Deep Learning and Bidirectional Optical Flow Based Viewport Predictions for 360° Video Coding
Abstract
The rapid development of virtual reality applications continues to urge better compression of 360° videos owing to the large volume of content. These videos are typically converted to 2-D formats using various projection techniques in order to benefit from ad-hoc coding tools designed to support conventional 2-D video compression. Although the recently emerged video coding standard, Versatile Video Coding (VVC), introduces 360° video specific coding tools, it fails to prioritize the user-observed regions in 360° videos, represented by rectilinear images called viewports. This leads to the encoding of redundant regions in the video frames, escalating the bit rate cost of the videos. In response to this issue, this paper proposes a novel 360° video coding framework for VVC which exploits user-observed viewport information to alleviate pixel redundancy in 360° videos. In this regard, bidirectional optical flow, a Gaussian filter and Spherical Convolutional Neural Networks (Spherical CNN) are deployed to extract perceptual features and predict user-observed viewports. By appropriately fusing the predicted viewports on the 2-D projected 360° video frames, a novel Regions of Interest (ROI) aware weightmap is developed which can be used to mask the source video and introduce adaptive changes to the Lagrange and quantization parameters in VVC. Comprehensive experiments conducted in the context of the VVC Test Model (VTM) 7.0 show that the proposed framework can improve bitrate reduction, achieving an average bitrate saving of 5.85% and up to 17.15% at the same perceptual quality, measured using Viewport Peak Signal-To-Noise Ratio (VPSNR).
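The abstract does not give the weightmap construction or the parameter mapping, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes an equirectangular projection, a Gaussian falloff around each predicted viewport center, a linear weight-to-QP-offset rule, and the QP-to-lambda relation commonly used in reference encoders; the function names, `sigma_deg` and `max_offset` are all hypothetical.

```python
import math

def angular_distance(yaw1, pitch1, yaw2, pitch2):
    """Great-circle angle (radians) between two viewing directions."""
    cos_d = (math.sin(pitch1) * math.sin(pitch2)
             + math.cos(pitch1) * math.cos(pitch2) * math.cos(yaw1 - yaw2))
    return math.acos(max(-1.0, min(1.0, cos_d)))

def roi_weightmap(width, height, viewport_centers, sigma_deg=30.0):
    """Per-pixel weight in [0, 1] on an equirectangular frame (assumed layout).

    Each pixel takes the largest Gaussian falloff over all predicted
    viewport centers, given as (yaw, pitch) pairs in radians.
    """
    sigma = math.radians(sigma_deg)
    weights = []
    for row in range(height):
        pitch = math.pi / 2 - (row + 0.5) / height * math.pi
        line = []
        for col in range(width):
            yaw = (col + 0.5) / width * 2 * math.pi - math.pi
            w = 0.0
            for c_yaw, c_pitch in viewport_centers:
                d = angular_distance(yaw, pitch, c_yaw, c_pitch)
                w = max(w, math.exp(-0.5 * (d / sigma) ** 2))
            line.append(w)
        weights.append(line)
    return weights

def qp_for_weight(base_qp, weight, max_offset=6):
    """Hypothetical rule: raise QP where the weight is low, clamp to 0..63."""
    qp = base_qp + round((1.0 - weight) * max_offset)
    return max(0, min(63, qp))

def lambda_scale(delta_qp):
    """Lagrange multiplier scaling under the assumed lambda ~ 2^(QP/3) relation."""
    return 2.0 ** (delta_qp / 3.0)

# Usage: one predicted viewport looking straight ahead on a tiny 8x4 frame.
weightmap = roi_weightmap(8, 4, [(0.0, 0.0)])
qp_near = qp_for_weight(32, weightmap[1][4])  # pixel close to the viewport center
qp_far = qp_for_weight(32, weightmap[0][0])   # pixel far from the viewport center
```

Pixels far from every predicted viewport receive a higher QP and a larger Lagrange multiplier, which is one way an encoder could spend fewer bits on regions the user is unlikely to see.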
Similar resources
Viewport-Driven Rate-Distortion Optimized 360° Video Streaming
The growing popularity of virtual and augmented reality communications and 360° video streaming is moving video communication systems into much more dynamic and resource-limited operating settings. The enormous data volume of 360° videos requires an efficient use of network bandwidth to maintain the desired quality of experience for the end user. To this end, we propose a framework for viewport...
Optical flow techniques applied to video coding
Motion estimation is an important part of most video coding schemes because it enables us to exploit the high degree of temporal redundancy present. Though block matching algorithms (BMA) yield coarse and piecewise-constant fields, they are very popular due to their simplicity and low bit overhead. In this paper, we propose to use a more advanced gradient-based technique to overcome the disadvant...
Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning
While great strides have been made in using deep learning algorithms to solve supervised learning tasks, the problem of unsupervised learning (leveraging unlabeled examples to learn about the structure of a domain) remains a difficult unsolved challenge. Here, we explore prediction of future frames in a video sequence as an unsupervised learning rule for learning about the structure of the vi...
Viewport-aware adaptive 360° video streaming using tiles for virtual reality
360° video is attracting an increasing amount of attention in the context of Virtual Reality (VR). Owing to its very high-resolution requirements, existing professional streaming services for 360° video suffer from severe drawbacks. This paper introduces a novel end-to-end streaming system from encoding to displaying, to transmit 8K resolution 360° video and to provide an enhanced VR experience...
Supplementary Material: Deep 360 Pilot: Learning a Deep Agent for Piloting through 360° Sports Videos
$$r\big(l_t(i), l_t^{gt}\big) = \begin{cases} 1 - \dfrac{\lVert l_t(i) - l_t^{gt} \rVert_2}{\eta}, & \text{if } \lVert l_t(i) - l_t^{gt} \rVert_2 \le \eta \\ -1, & \text{otherwise} \end{cases} \qquad (1)$$
where $\eta$ equals the distance from the center of a viewing angle to the corner of its corresponding NFoV, i.e., $\sqrt{32.75^2 + 24.56^2} = 40.9$ if we define NFoV as spanning a horizontal angle of 65.5° with a 4:3 aspect ratio. When $l_t(i) = l_t^{gt}$, the reward is 1, which is the maximum reward. When $\lVert l_t(i) - l_t^{gt} \rVert_2$ ...
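As a quick numerical check of equation (1) and the value of η, the reward can be sketched in a few lines. This sketch assumes viewing angles are given as (horizontal, vertical) pairs in degrees and compared with a plain Euclidean norm, as the formula is written; wrap-around at 360° is ignored, and the function names are hypothetical.

```python
import math

def nfov_eta(horizontal_fov_deg=65.5, aspect=(4, 3)):
    """Distance from the NFoV center to its corner, in degrees."""
    half_w = horizontal_fov_deg / 2              # 32.75 for a 65.5 degree NFoV
    half_h = half_w * aspect[1] / aspect[0]      # about 24.56 for a 4:3 ratio
    return math.hypot(half_w, half_h)

def reward(pred, gt, eta):
    """Equation (1): linear falloff inside radius eta, -1 outside."""
    dist = math.dist(pred, gt)
    return 1 - dist / eta if dist <= eta else -1.0

eta = nfov_eta()
perfect = reward((10.0, 5.0), (10.0, 5.0), eta)   # prediction equals ground truth
missed = reward((10.0, 5.0), (80.0, 5.0), eta)    # 70 degrees away, beyond eta
```

With the default arguments, `eta` rounds to 40.9, matching the value quoted in the text.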
Journal
Journal title: IEEE Access
Year: 2022
ISSN: 2169-3536
DOI: https://doi.org/10.1109/access.2022.3219861